The exact evaluation of the disconnected diagram contributions to the flavor-singlet pseudo-scalar meson mass, the nucleon $\sigma$-term and the nucleon electromagnetic form factors is carried out utilizing GPGPU technology with the NVIDIA CUDA platform. The disconnected loops are also computed using stochastic methods with several noise reduction techniques. Various dilution schemes as well as the truncated solver method are studied. We compare these stochastic techniques to the exact results and show that the number of noise vectors required depends on the operator insertion in the fermion loop.

I. INTRODUCTION 

An accurate estimate of disconnected contributions to flavor-singlet quantities remains one of the most computationally demanding problems in applying lattice Quantum Chromodynamics (QCD) to the study of hadron physics. The determination, for instance, of the strange content of the nucleon, which has been extensively studied both experimentally and theoretically, requires the evaluation of fermion loops. A nice example of an analysis that combines lattice results and chiral expansion techniques to determine the strange magnetic and electric form factors of the nucleon is presented in Refs. The exact evaluation of disconnected diagrams is extremely difficult because one needs to calculate the all-to-all propagators. Furthermore, the gauge noise for some of these disconnected diagrams dominates the signal, and large statistics are required to reduce the error. To avoid performing all the inversions required for an exact evaluation of the all-to-all propagators, the standard approach is to use stochastic techniques with a variety of dilution schemes to estimate them. Such techniques have been applied recently in the evaluation of the $\eta'$ mass, the nucleon $\sigma$-term, the electromagnetic form factors and the hadronic contribution to $g-2$ [8]-[13]. The aim of this work is to evaluate a representative set of disconnected loops exactly and to compare with the routinely used stochastic techniques in order to benchmark the various approaches. Recently, special hardware accelerators such as graphics processors have been used to speed up the inversions themselves. The exact evaluation is thus carried out using graphics cards (GPUs) to efficiently calculate the all-to-all propagator. Since the purpose of this work is to benchmark the various methods rather than to produce state-of-the-art results, we use gauge configurations generated by the SESAM Collaboration on a relatively small lattice of size $16^3 \times 32$ in order to facilitate the exact evaluation of the fermion loops. We examine various fermion loops that enter into the evaluation of observables that have been recently studied on the lattice. These are the mass of the $\eta'$ and the nucleon scalar and electromagnetic (EM) form factors.
II. LATTICE ENSEMBLE AND SIMULATION PARAMETERS



For this exploratory study we use gauge configurations generated by the SESAM/TχL Collaboration using $N_f = 2$ Wilson fermions at $\beta = 5.6$ and hopping parameter $\kappa = 0.157$. This corresponds to a pion mass of $a m_\pi = 0.3452(29)$ [17]. In order to convert to physical units we follow Ref. [17] and use the Sommer parameter, $r_0$, defined through the force between two static quarks at intermediate distance [18]. The value taken from Ref. [13] is $r_0 = 0.5$ fm, and at $\kappa = 0.157$ it yields an inverse lattice spacing of $a^{-1} = 2.16(3)$ GeV, giving $m_\pi = 746$ MeV. A total of 165 configurations are analyzed.
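The conversion from lattice to physical units quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the function names and the use of the standard value of $\hbar c$ to obtain the lattice spacing in fm are additions made for illustration.

```python
# Values quoted in the text.
A_M_PI = 0.3452      # pion mass in lattice units, a * m_pi
A_INV_GEV = 2.16     # inverse lattice spacing a^{-1} in GeV

# Standard conversion constant hbar * c in GeV * fm (not taken from the paper).
HBARC_GEV_FM = 0.1973269804


def to_gev(mass_in_lattice_units, a_inv_gev=A_INV_GEV):
    """Convert a dimensionless lattice mass a*m to GeV."""
    return mass_in_lattice_units * a_inv_gev


def lattice_spacing_fm(a_inv_gev=A_INV_GEV):
    """Lattice spacing a in fm from a^{-1} in GeV."""
    return HBARC_GEV_FM / a_inv_gev


m_pi_mev = 1000.0 * to_gev(A_M_PI)
print(f"m_pi = {m_pi_mev:.0f} MeV")
print(f"a    = {lattice_spacing_fm():.4f} fm")
```

The same helper applies to any other mass quoted in lattice units on this ensemble.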

The general form of a disconnected loop is given by 

$$ L(x) = \mathrm{Tr}\left[\Gamma\, G(x;x)\right] , \qquad (1) $$
where for $\Gamma$ we consider $\Gamma = 1$, $\gamma_5$ and $\gamma_\mu$, and $G(x;x)$ is the Dirac propagator. The main question that we would like to address is which type of stochastic technique most efficiently reproduces the exact result and whether an optimal method exists independently of the $\Gamma$ insertion in the loop. It is clear from the form of Eq. (1) that, in order to evaluate quantities that involve the spatial sum of $L(x)$, one needs a factor of the spatial volume more inversions than for the point-to-all propagator. In this work, we evaluate exactly the all-to-all propagators for the particular set of parameters given above. This is computationally very intensive and it is therefore beneficial to take advantage of graphics accelerators. We employ the QUDA library, which provides mixed precision implementations of conjugate gradient (CG) and BiCGstab solvers for the NVIDIA CUDA platform, in order to evaluate the all-to-all propagators [19]. The exact result then provides a benchmark at the level of gauge noise for the quantities with contributions from disconnected loops. We note that GPUs are used for performing all the inversions, including those for the stochastic evaluation. Increasing the number of configurations to reduce the gauge noise is expensive, since this would require evaluating more all-to-all propagators. Therefore in this work we limit ourselves to 165 gauge configurations.
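To give a sense of scale, the following sketch counts the inversions behind an exact loop on one time slice. It assumes the usual counting of one solve per spin-color component of every point source (4 spins times 3 colors), which the text implies but does not spell out; the function name is hypothetical.

```python
def exact_inversions_per_timeslice(spatial_extent, n_spin=4, n_color=3):
    """Number of solves needed to obtain G(x;x) at every spatial site of one
    time slice, assuming one solve per (site, spin, color) point source."""
    return spatial_extent ** 3 * n_spin * n_color


POINT_TO_ALL = 4 * 3  # solves for a single point-to-all propagator

exact = exact_inversions_per_timeslice(16)
print("point-to-all propagator:", POINT_TO_ALL, "inversions")
print("exact loop, one time slice:", exact, "inversions")
print("ratio (spatial volume):", exact // POINT_TO_ALL)
```

Under this counting the exact evaluation on the $16^3$ spatial volume costs a factor of 4096 more than a single point-to-all propagator per time slice.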

The stochastic estimate of the disconnected quark loops is performed using complex Z2 noise for the source vectors in combination with several dilution schemes and the truncated solver method [9]. Specifically, we consider space, color and spin dilution schemes. Color (spin) dilution requires three (four) times more inversions than the undiluted case, whereas even-odd partitioning of the space doubles that number. In addition to even-odd dilution, we have also applied a cubic dilution, where separate sources are placed on each vertex of an elementary 3-dimensional cube and repeated throughout the lattice, increasing the number of inversions by a factor of 8. The truncated solver method effectively partitions the problem into a low precision and a high precision space. A large number of low precision inversions are carried out to obtain an approximation to the propagator with low stochastic error (but accurate only to low precision). A high precision stochastic correction is then applied using a small ensemble, with the corresponding inversions carried out to high precision. The size of the ensemble of noise vectors for the low precision space and of the corresponding ensemble for the high precision correction is examined for the various loops. Time dilution is applied in all cases, and we exploit translational invariance in order to limit the number of time slices for which the exact evaluation of the fermion loops is required.
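The inversion counts quoted for each dilution scheme follow from how a single noise vector is split into disjoint sources. The sketch below illustrates this on a small toy lattice; the lattice size, the function names and the reading of "complex Z2 noise" as independent $\pm 1/\sqrt{2}$ real and imaginary parts are assumptions made for this example, not details taken from the text.

```python
import itertools
import math
import random

L = 4                     # toy spatial extent (hypothetical, illustration only)
N_SPIN, N_COLOR = 4, 3


def all_indices():
    """All (x, y, z, spin, color) components on one time slice."""
    return list(itertools.product(range(L), range(L), range(L),
                                  range(N_SPIN), range(N_COLOR)))


def dilution_label(index, scheme):
    """Return which diluted source a given component belongs to."""
    x, y, z, spin, color = index
    if scheme == "none":
        return 0
    if scheme == "color":
        return color
    if scheme == "spin":
        return spin
    if scheme == "even-odd":
        return (x + y + z) % 2
    if scheme == "cubic":
        return (x % 2) + 2 * (y % 2) + 4 * (z % 2)
    raise ValueError(f"unknown scheme: {scheme}")


def z2_complex_noise(rng):
    """One noise vector with entries (+-1 +- i)/sqrt(2), so |xi| = 1."""
    s = 1.0 / math.sqrt(2.0)
    return {idx: complex(rng.choice((-s, s)), rng.choice((-s, s)))
            for idx in all_indices()}


def diluted_sources(noise, scheme):
    """Split a noise vector into disjoint sources, one inversion each."""
    sources = {}
    for idx, value in noise.items():
        sources.setdefault(dilution_label(idx, scheme), {})[idx] = value
    return list(sources.values())


noise = z2_complex_noise(random.Random(1))
for scheme in ("none", "color", "spin", "even-odd", "cubic"):
    parts = diluted_sources(noise, scheme)
    print(f"{scheme:9s}: {len(parts)} inversion(s) per noise vector")
```

Each diluted source has support on a subset of components only, and the sources of one scheme add up to the original noise vector.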

The stochastic evaluation schemes are based on creating an ensemble of noise-vectors with the properties 

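$$ \frac{1}{N}\sum_{r=1}^{N} \xi^{a}_{\mu}(x)_r \equiv \langle \xi^{a}_{\mu}(x) \rangle_r = 0 \qquad (2) $$

and

$$ \frac{1}{N}\sum_{r=1}^{N} \xi^{a}_{\mu}(x)_r\, \xi^{*a'}_{\mu'}(x')_r \equiv \langle \xi^{a}_{\mu}(x)\, \xi^{*a'}_{\mu'}(x') \rangle_r = \delta(x-x')\,\delta_{\mu\mu'}\,\delta_{aa'} . \qquad (3) $$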

Using the above properties one has 

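$$ \langle \phi^{a}_{\mu}(x)\, \xi^{*b}_{\nu}(y) \rangle_r = \sum_{y'} G^{aa'}_{\mu\mu'}(x;y')\, \delta(y-y')\,\delta_{\mu'\nu}\,\delta_{a'b} = G^{ab}_{\mu\nu}(x;y) , \qquad (4) $$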

where $\phi$ is the solution vector corresponding to a source with noise vector $\xi$. The above equation provides an approximation to the exact all-to-all propagator because the property of the noise vectors $\xi$ given in Eq. (3) is exact only for an infinite ensemble of vectors. In practice, one takes a large number of noise vectors and tests the stability of the results when increasing the number of vectors. In this work, we determine how large the number of noise vectors should be in order to reproduce the exact result.

Using the stochastic estimate of the all-to-all propagator the disconnected loop is written as 

$$ L(\vec{x}, t_0) = \frac{1}{N}\sum_{r} \xi^{\dagger}(\vec{x}, t_0)_r\, \Gamma\, \phi(\vec{x}, t_0)_r . \qquad (5) $$

Given that Eq. (5) provides an estimate of the exact result, the question is whether the required size of the noise vector ensemble depends on the type of $\Gamma$ matrix in the loop.
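The comparison between a stochastic and an exact loop can be illustrated on a toy problem. The sketch below replaces the Dirac operator by a small, diagonally dominant complex matrix (a hypothetical stand-in with no physical content), computes $\mathrm{Tr}[\Gamma M^{-1}]$ exactly with one solve per unit source, and compares it with the noise average of $\xi^\dagger \Gamma \phi$. All names and parameter values are assumptions made for this illustration.

```python
import math
import random


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def toy_operator(n, rng):
    """Hypothetical stand-in for the Dirac matrix (diagonally dominant)."""
    return [[4.0 if i == j else
             0.3 * complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
             for j in range(n)] for i in range(n)]


def exact_loop(matrix, gamma):
    """Tr[Gamma G] with G = M^{-1}, using one solve per unit source."""
    n = len(matrix)
    total = 0j
    for j in range(n):
        unit = [1.0 if i == j else 0.0 for i in range(n)]
        column = solve(matrix, unit)          # column j of G
        total += sum(gamma[j][i] * column[i] for i in range(n))
    return total


def stochastic_loop(matrix, gamma, n_noise, rng):
    """Noise average of xi^dagger Gamma phi, with M phi = xi."""
    n = len(matrix)
    s = 1.0 / math.sqrt(2.0)
    total = 0j
    for _ in range(n_noise):
        xi = [complex(rng.choice((-s, s)), rng.choice((-s, s)))
              for _ in range(n)]
        phi = solve(matrix, xi)
        gamma_phi = [sum(gamma[i][k] * phi[k] for k in range(n))
                     for i in range(n)]
        total += sum(xi[i].conjugate() * gamma_phi[i] for i in range(n))
    return total / n_noise


rng = random.Random(7)
N = 6
M = toy_operator(N, rng)
# Diagonal +-1 insertion, loosely mimicking a gamma_5-like matrix.
GAMMA = [[(1.0 if i < N // 2 else -1.0) if i == j else 0.0
          for j in range(N)] for i in range(N)]
exact = exact_loop(M, GAMMA)
for n_noise in (10, 100, 2000):
    estimate = stochastic_loop(M, GAMMA, n_noise, rng)
    print(n_noise, abs(estimate - exact))
```

The deviation from the exact value is governed by the off-diagonal elements of $\Gamma M^{-1}$, which is one way to see why the number of noise vectors needed can depend on the insertion.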

In the following sections we discuss the results according to the different choices of the $\Gamma$-matrices entering in the loop.
III. EVALUATION OF THE $\eta'$ MASS

A general non-flavor singlet meson correlator $C_{AB}(\vec{p}, t)$ can be written as

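$$ C_{AB}(\vec{p}, t_f - t_i) = \sum_{\vec{x}_f,\vec{x}_i} e^{i\vec{p}\cdot(\vec{x}_f-\vec{x}_i)} \langle J_A(\vec{x}_f,t_f)\, J_B^{\dagger}(\vec{x}_i,t_i) \rangle $$

$$ \qquad = \sum_{\vec{x}_f,\vec{x}_i} \left\langle e^{i\vec{p}\cdot(\vec{x}_f-\vec{x}_i)}\, \mathrm{Tr}\left[\Gamma_A\, G_2(\vec{x}_f,t_f;\vec{x}_i,t_i)\, \Gamma_B\, G_1(\vec{x}_i,t_i;\vec{x}_f,t_f)\right] \right\rangle , \qquad (6) $$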

where the interpolating field is $J_A(x) = \bar{q}_1 \Gamma_A q_2$ for different quark flavors $q_1$ and $q_2$, and $G_j(\vec{x}', t'; \vec{x}, t)$ is the propagator of a quark of flavor $j$ from space-time point $(\vec{x}, t)$ to space-time point $(\vec{x}', t')$. Spin and color indices are suppressed. For flavor singlet mesons such as the $\eta'$, besides the connected part given by Eq. (6) there are disconnected contributions to the correlation function, which are given by

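$$ D_{AB}(\vec{p}, t_f - t_i) = \sum_{\vec{x}_f,\vec{x}_i} \left\langle \mathrm{Tr}\left[e^{i\vec{p}\cdot\vec{x}_f}\, \Gamma_A\, G_2(\vec{x}_f,t_f;\vec{x}_f,t_f)\right]\, \mathrm{Tr}\left[e^{-i\vec{p}\cdot\vec{x}_i}\, \Gamma_B\, G_1(\vec{x}_i,t_i;\vec{x}_i,t_i)\right] \right\rangle . \qquad (7) $$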

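Smearing is routinely used to decrease the overlap with excited states. In this work, in addition to local fields, we consider Gaussian smeared quark fields [20, 21] for the construction of the interpolating fields:

$$ q^{a}_{\rm smear}(t,\vec{x}) = \sum_{\vec{y}} F^{ab}(\vec{x},\vec{y};U(t))\, q^{b}(t,\vec{y}) , \qquad (8) $$

$$ F = (1 + \alpha H)^{n} , $$

$$ H(\vec{x},\vec{y};U(t)) = \sum_{i=1}^{3} \left[ U_i(x)\,\delta_{x,y-\hat{\imath}} + U_i^{\dagger}(x-\hat{\imath})\,\delta_{x,y+\hat{\imath}} \right] . $$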

In addition, we apply APE-smearing to the gauge fields $U_\mu$ entering the hopping matrix $H$. All forward point-to-all propagators are computed with smearing, using the smearing parameters $\alpha = 4.0$ and $n = 50$. These values were determined by optimizing ground state dominance for the nucleon [22]. The loops are computed without smearing throughout, including those involved in the computation of the $\eta'$ mass. Since the purpose of this work is to compare exact results with those using various stochastic approaches, we did not repeat the evaluation of the exact loops with smearing.
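As an illustration of how iterated hopping spreads a point source, the sketch below applies a smearing operator of the form $(1 + \alpha H)^n$ to a scalar field on a small periodic lattice. It is a simplification under stated assumptions: all gauge links are set to unity (no APE-smeared links, no color structure), the lattice extent is a toy value, and the operator form is the common Gaussian smearing convention rather than something confirmed by the text. Only $\alpha = 4.0$ and $n = 50$ are taken from the text.

```python
import itertools


def hop(field, extent):
    """Apply H with unit gauge links: sum over the six spatial neighbours."""
    out = {}
    for site in field:
        total = 0.0
        for axis in range(3):
            for step in (+1, -1):
                neighbour = list(site)
                neighbour[axis] = (neighbour[axis] + step) % extent
                total += field[tuple(neighbour)]
        out[site] = total
    return out


def gaussian_smear(field, extent, alpha, n_iter):
    """Apply (1 + alpha * H) to the field n_iter times."""
    for _ in range(n_iter):
        hopped = hop(field, extent)
        field = {site: field[site] + alpha * hopped[site] for site in field}
    return field


EXTENT = 6  # toy lattice extent (hypothetical)
sites = list(itertools.product(range(EXTENT), repeat=3))
point = {site: 0.0 for site in sites}
point[(0, 0, 0)] = 1.0

smeared = gaussian_smear(point, EXTENT, alpha=4.0, n_iter=50)
norm = sum(smeared.values())
profile = [smeared[(d, 0, 0)] / smeared[(0, 0, 0)] for d in range(4)]
print("relative profile along x:", [f"{p:.3f}" for p in profile])
```

With unit links each application multiplies the sum of the field by $1 + 6\alpha$, which gives a simple consistency check on the implementation.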

For the particular case of the $\eta'$ meson, and since we are using an $N_f = 2$ gauge ensemble, there are no strange quark contributions. Therefore the flavor singlet pseudo-scalar meson (also denoted by $\eta_2$ in the $N_f = 2$ theory) has only contributions from the light quarks and its two-point correlator can be written as

$$ C_{\eta'}(t) = C_\pi(t) - 2 D(t) , \qquad (9) $$

where we have taken $\vec{p} = 0$ and dropped the flavor indices $f_1$ and $f_2$.

For mesons on lattices with periodic boundary conditions the two-point correlation function can be written as $C(t) \sim e^{-mt} + e^{-m(T-t)}$ for large $t$ and $t \ll T$, where $T$ is the lattice temporal extent. We can therefore analyze the ratio of the disconnected quark loop, $D(t)$, and the connected correlation function, $C_\pi(t)$, to extract the flavor-singlet pseudo-scalar meson mass:

$$ R(t) \equiv \frac{D(t)}{C_\pi(t)} = A - B\, \frac{e^{-m_{\eta'} t} + e^{-m_{\eta'}(T-t)}}{e^{-m_\pi t} + e^{-m_\pi (T-t)}} , \qquad (10) $$

where $m_\pi$ and $m_{\eta'}$ are the masses of the $\pi$ and $\eta'$ mesons and $A$, $B$ are additional fit parameters. The pion mass $m_\pi$ can be determined separately by fitting the pion correlator to the 1% level and used in Eq. (10), leaving only 3 parameters in the fit function. Adopting this approach allows one to use independent smearing of the connected and disconnected loops.
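One way to see why a ratio of disconnected to connected correlators isolates $m_{\eta'}$ is sketched below, under the assumption that each correlator is dominated by a single state. The amplitudes $c_\pi$ and $c_{\eta'}$ are symbols introduced only for this illustration.

```latex
\begin{align}
C_{\eta'}(t) = C_\pi(t) - 2D(t)
  \quad &\Longrightarrow \quad
  \frac{D(t)}{C_\pi(t)} = \frac{1}{2}\left[1 - \frac{C_{\eta'}(t)}{C_\pi(t)}\right], \\
\frac{D(t)}{C_\pi(t)} &\simeq \frac{1}{2} - \frac{c_{\eta'}}{2\,c_\pi}\,
  \frac{e^{-m_{\eta'} t} + e^{-m_{\eta'}(T-t)}}{e^{-m_\pi t} + e^{-m_\pi (T-t)}} .
\end{align}
```

In a fit, the constant and the amplitude ratio would plausibly be left as free parameters, since they change when the connected and disconnected parts are smeared differently.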

In the case of the exact evaluation, the only source of error is the statistical error of the gauge ensemble, and we use this fact to assess the results obtained with the different stochastic methods using color, spin, even-odd and cubic dilution. In addition, we compare with the truncated solver method, where for the low precision inversions using BiCGstab we set the relative deviation, i.e. $|M x_i - b|/|b|$, to 10~^ and for the high precision we set it to 10~*, where $M$ is the Wilson Dirac operator, $x_i$ the solution vector after $i$ iterations and $b$ the source vector.
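The structure of the truncated solver estimator can be sketched on a toy problem: a cheap average over many loosely converged solves, plus a correction from a few noise vectors solved at both precisions. Everything below is an illustrative assumption: the matrix is a hypothetical stand-in for the Dirac operator, a Jacobi iteration replaces BiCGstab, and the tolerances `TOL_LP` and `TOL_HP` are example values, not the ones used in the paper.

```python
import math
import random


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))


def jacobi_solve(m, b, tol, max_iter=10_000):
    """Iterate x <- x + D^{-1}(b - M x) until |M x - b| / |b| < tol."""
    n = len(b)
    x = [0j] * n
    for iteration in range(1, max_iter + 1):
        mx = matvec(m, x)
        x = [x[i] + (b[i] - mx[i]) / m[i][i] for i in range(n)]
        mx = matvec(m, x)
        if norm([b[i] - mx[i] for i in range(n)]) / norm(b) < tol:
            return x, iteration
    raise RuntimeError("Jacobi iteration did not converge")


def z2_noise(n, rng):
    s = 1.0 / math.sqrt(2.0)
    return [complex(rng.choice((-s, s)), rng.choice((-s, s)))
            for _ in range(n)]


def dot(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))


def tsm_trace(m, n_lp, n_hp, tol_lp, tol_hp, rng):
    """Truncated-solver estimate of Tr[M^{-1}]:
    low precision average plus a high precision correction."""
    n = len(m)
    lp_sum = 0j
    for _ in range(n_lp):
        xi = z2_noise(n, rng)
        phi_lp, _ = jacobi_solve(m, xi, tol_lp)
        lp_sum += dot(xi, phi_lp)
    corr_sum = 0j
    for _ in range(n_hp):
        xi = z2_noise(n, rng)            # independent noise for the correction
        phi_hp, _ = jacobi_solve(m, xi, tol_hp)
        phi_lp, _ = jacobi_solve(m, xi, tol_lp)
        corr_sum += dot(xi, [h - l for h, l in zip(phi_hp, phi_lp)])
    return lp_sum / n_lp + corr_sum / n_hp


def exact_trace(m, tol=1e-12):
    """Tr[M^{-1}] from one tightly converged solve per unit source."""
    n = len(m)
    total = 0j
    for j in range(n):
        unit = [1.0 if i == j else 0.0 for i in range(n)]
        column, _ = jacobi_solve(m, unit, tol)
        total += column[j]
    return total


rng = random.Random(5)
N = 8
M = [[4.0 if i == j else 0.3 * complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
      for j in range(N)] for i in range(N)]
TOL_LP, TOL_HP = 1e-1, 1e-10             # illustrative values only
exact = exact_trace(M)
estimate = tsm_trace(M, n_lp=2000, n_hp=200, tol_lp=TOL_LP, tol_hp=TOL_HP, rng=rng)
print("exact:", exact, " TSM:", estimate)
```

Because the correction term uses the same truncation rule on independent noise, the bias introduced by the loose solves cancels on average; how many high precision vectors are needed for that cancellation to be visible in practice is the question examined in the text.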
FIG. 1: Left panel: The error on the disconnected part of the $\eta'$ correlator $D(t)$ at $t/a = 3$ as a function of the number of inversions. The line shows the statistical error. Right panel: The disconnected part of the $\eta'$ correlator $D(t)$ at $t/a = 3$ computed stochastically as a function of the number of inversions. The lines show the mean value and error band of the exact result for $D(t)$ at the same time slice. In both graphs we show, from top to bottom, results using color, spin, even-odd and cubic dilution.

FIG. 2: Left panel: The error on the disconnected part of the $\eta'$ correlator $D(t)$ at $t/a = 3$ using the truncated solver method as a function of the number of low precision noise vectors, for 10 (top), 50 (middle) and 120 (bottom) high precision noise vectors. Right panel: The disconnected part itself for the same cases. The lines show the mean value and error band of the exact result for $D(t)$ at the same time slice.



In Fig. 1 we show results for the disconnected contributions $D(t)$ at $t/a = 3$ as a function of the number of stochastic inversions using color, spin, even-odd and cubic dilution. In our case the exact result is known and, since all results are computed on the same gauge configurations, the stochastic evaluation should coincide with the result obtained from the exact evaluation of the all-to-all propagators in the limit of a large enough number of noise vectors. Therefore, for the test case examined in this work, we show not only the error in the stochastic evaluation but also the mean value, which should coincide with that of the exact evaluation. Using a different set of gauge configurations will only reproduce the result within the statistical error, and therefore in practice one requires that the error obtained in the stochastic evaluation converges to the gauge error. As can be seen in Fig. 1, the stochastic error remains almost unchanged after about 600-800 inversions for the four dilution schemes used. In the limit of a large number of noise vectors, the stochastic approach reproduces the mean value of the exact result as defined on the given set of gauge configurations and shown by the error band. From this comparison we also conclude that the even-odd and cubic dilution schemes behave very similarly, and therefore in what follows we show results only for even-odd dilution, which is the most commonly used.

FIG. 3: The ratio $R(t)$ computed using the truncated solver method and the exact approach. The filled (red) squares show the exact calculation, and the filled (green) circles, filled (blue) triangles and filled (magenta) rhombuses show the results when using 10, 50 and 120 high precision noise vectors, respectively.

FIG. 4: The mass of the $\eta'$ as a function of computational cost (high precision inversions). The exact result is shown with the filled (red) squares and the results of the stochastic truncated solver method with the filled (green) circles, as a function of the number of high precision vectors.

In Fig. 2 we show results for $D(t)$ at the same time slice, namely $t/a = 3$, as used for the results in Fig. 1, but this time obtained with the truncated solver method. We display results as a function of the number of low accuracy inversions, increasing consecutively the number of high precision noise vectors. Since the low precision is set to relative precision 10~^ we only need about 10 BiCGStab iterations, as compared to about 150 iterations for the high precision. As the number of high precision (HP) vectors increases, the stochastic error converges to the statistical one faster. However, for 10 HP vectors, although the stochastic noise converges, the mean value remains smaller than the exact result. Therefore, one has to increase the number of HP vectors until the mean value also stabilizes. As can be seen, for 50 HP vectors the mean value and the error have converged when the number of low precision vectors is above about 2500, i.e. when the number of high precision noise vectors is a few percent of the number of low precision vectors. This corresponds to a cost of about 215 HP inversions, as compared to about 600 for the other stochastic methods. Therefore, the truncated solver method is by far the most efficient in reproducing the exact result for the case of the loops with a $\gamma_5$ insertion.
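The cost quoted for the truncated solver method can be reproduced approximately by weighting each low precision solve by its iteration count relative to a high precision solve. The sketch below uses only the numbers given in the text (about 10 and about 150 iterations, 50 HP vectors, about 2500 LP vectors); the function name is hypothetical, and the low precision solves needed inside the correction term are neglected as a simplifying assumption.

```python
LP_ITERATIONS = 10    # approx. BiCGStab iterations per low precision solve
HP_ITERATIONS = 150   # approx. BiCGStab iterations per high precision solve


def tsm_cost_in_hp_units(n_hp, n_lp,
                         lp_iters=LP_ITERATIONS, hp_iters=HP_ITERATIONS):
    """Cost in units of one high precision inversion."""
    return n_hp + n_lp * lp_iters / hp_iters


cost = tsm_cost_in_hp_units(n_hp=50, n_lp=2500)
print(f"TSM cost: about {cost:.0f} HP-equivalent inversions")
print(f"relative to ~600 inversions with dilution: {cost / 600:.2f}")
```

This simple estimate lands close to the figure of about 215 HP inversions given in the text.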

Having established the preferred stochastic method for the evaluation of the disconnected part, we turn to the determination of the $\eta'$ mass in this $N_f = 2$ theory. Given that $m_\pi$ is determined to the 1% level, it is clear that the gauge noise, due to the disconnected loops, is large.

In Fig. 3 we compare the exact result for the ratio of disconnected to connected contributions with the results obtained using the truncated solver method. As already mentioned, the ratio method allows us to use different smearing for the disconnected and connected parts. We have therefore considered smeared interpolating fields for the connected parts. As can be seen, the stochastic results converge when 50 high precision noise vectors are used. Fitting the ratio $R(t)$ from $t = 2a$ until $t = 8a$ we extract the mass of the $\eta'$ shown in Fig. 4. As can be seen, increasing the number of high precision noise vectors above 50 leaves the error and mean value unchanged. Therefore the truncated solver method reproduces the exact result at low cost. This enables computation of the loop at all time slices, unlike in the case of the exact result where we limit ourselves to 8 time slices. The value we extract for the mass of the $\eta'$ is $a m_{\eta'} = 0.54(10)$ or $m_{\eta'} = 1.17(22)$ GeV, slightly higher than the value estimated by the SESAM collaboration using these configurations [23]. Recently, the mass of the $\eta_2$ meson was studied using $N_f = 2$ twisted mass fermions [8]. Within a pion mass range of about 500 MeV to about 300 MeV the dependence of the $\eta_2$ mass on the light quark mass was shown to be mild, and extrapolating to the physical point a value of 0.865(65)(65) GeV was obtained. Our value is thus in reasonable agreement, taking into account the higher light quark mass used in this study.



IV. NUCLEON ELECTROMAGNETIC FORM FACTORS 



In order to extract the nucleon electromagnetic form factors we need to evaluate the nucleon matrix element $\langle N(p', s') | j^\mu | N(p, s) \rangle$, where $|N(p', s')\rangle$ and $|N(p, s)\rangle$ are nucleon states with final momentum $p'$ and spin $s'$, and initial momentum $p$ and spin $s$. The nucleon electromagnetic matrix element for real or virtual photons can be decomposed in terms of the Dirac and Pauli form factors $F_1$ and $F_2$, respectively:



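$$ \langle N(p', s') | j^\mu | N(p, s) \rangle = \left( \frac{m_N^2}{E_N(\vec{p}')\, E_N(\vec{p})} \right)^{1/2} \bar{u}(p', s') \left[ \gamma^\mu F_1(q^2) + \frac{i \sigma^{\mu\nu} q_\nu}{2 m_N} F_2(q^2) \right] u(p, s) , \qquad (11) $$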



where $q^2 = (p' - p)^2$, $m_N$ is the nucleon mass and $E_N(\vec{p})$ its energy. $F_1(0) = 1$ for the proton and zero for the neutron, and $F_2(0)$ measures the anomalous magnetic moment. $F_1$ and $F_2$ are connected to the electric, $G_E$, and magnetic, $G_M$, Sachs form factors by the relations

$$ G_E(q^2) = F_1(q^2) + \frac{q^2}{(2 m_N)^2} F_2(q^2) , $$

$$ G_M(q^2) = F_1(q^2) + F_2(q^2) . \qquad (12) $$
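The relation between the Dirac/Pauli and the Sachs form factors is simple enough to state as a small helper. The sketch below assumes the standard convention $G_E = F_1 + \frac{q^2}{4 m_N^2} F_2$ and $G_M = F_1 + F_2$ with $q^2$ the Minkowski momentum transfer squared (negative for space-like kinematics); the numerical inputs are illustrative, with 1.793 being the well-known anomalous magnetic moment of the proton in nuclear magnetons.

```python
def sachs_form_factors(f1, f2, q2, m_n):
    """Return (G_E, G_M) from F_1 and F_2.

    q2  : Minkowski four-momentum transfer squared in GeV^2
          (negative for space-like momentum transfer)
    m_n : nucleon mass in GeV
    """
    g_e = f1 + q2 / (2.0 * m_n) ** 2 * f2
    g_m = f1 + f2
    return g_e, g_m


# Proton at zero momentum transfer: F_1(0) = 1, F_2(0) = anomalous moment.
g_e0, g_m0 = sachs_form_factors(f1=1.0, f2=1.793, q2=0.0, m_n=0.938)
print(f"G_E(0) = {g_e0:.3f}, G_M(0) = {g_m0:.3f}")
```

At $q^2 = 0$ the electric form factor reduces to the charge and the magnetic one to the total magnetic moment, which is a convenient sanity check on any implementation.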

An interpolating field for the proton is given by 

$$ J(x) = \epsilon^{abc} \left[ u^{aT}(x)\, C\gamma_5\, d^{b}(x) \right] u^{c}(x) . \qquad (13) $$

As described in Section III, in order to increase the overlap with the nucleon state and decrease the overlap with excited states we use Gaussian smeared quark fields with APE smeared links.

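In order to extract the nucleon matrix element of Eq. (11) we need to calculate the two-point and three-point functions in Euclidean time, defined by

$$ G(\vec{p}, t_f) = \sum_{\vec{x}_f} e^{-i \vec{x}_f \cdot \vec{p}}\; \Gamma_0^{\beta\alpha}\, \langle J_\alpha(\vec{x}_f, t_f)\, \bar{J}_\beta(\vec{0}, 0) \rangle , \qquad (14) $$

$$ G^{\mu}(\Gamma, \vec{q}, t) = \sum_{\vec{x}, \vec{x}_f} e^{i \vec{x} \cdot \vec{q}}\; \Gamma^{\beta\alpha}\, \langle J_\alpha(\vec{x}_f, t_f)\, j^{\mu}(\vec{x}, t)\, \bar{J}_\beta(\vec{0}, 0) \rangle , \qquad (15) $$

where $\Gamma_0$ and $\Gamma_k$ are the projection matrices:

$$ \Gamma_0 = \frac{1}{4}(1 + \gamma_4) , \qquad \Gamma_k = i\, \Gamma_0 \gamma_5 \gamma_k . \qquad (16) $$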



FIG. 5: Left: Connected nucleon three-point function. Right: Disconnected nucleon three-point function.



The kinematical setup that we used is illustrated in Fig. 5. The creation (source) operator at time $t_i = 0$ has fixed spatial position $\vec{x}_i = 0$. The annihilation (sink) operator at a later time $t_f$ carries momentum $\vec{p}' = 0$. The current couples to a quark at an intermediate time $t$ and carries the momentum $\vec{q}$. Translation invariance enforces $\vec{q} = -\vec{p}$ for our kinematics. The form factors are calculated as a function of $Q^2 = -q^2 > 0$, which is the Euclidean momentum transfer squared. Provided the Euclidean times $t$ and $t_f - t$ are large enough to filter the nucleon ground state, the time dependence of the Euclidean time evolution and the overlap factors cancel in the ratio



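$$ R^{\mu}(\Gamma, \vec{q}, t) = \frac{G^{\mu}(\Gamma, \vec{q}, t)}{G(\vec{0}, t_f)} \sqrt{ \frac{G(\vec{p}, t_f - t)\, G(\vec{0}, t)\, G(\vec{0}, t_f)}{G(\vec{0}, t_f - t)\, G(\vec{p}, t)\, G(\vec{p}, t_f)} } , \qquad (17) $$

yielding a time-independent value

$$ \lim_{t_f - t \to \infty}\; \lim_{t \to \infty} R^{\mu}(\Gamma, \vec{q}, t) = \Pi^{\mu}(\Gamma, \vec{q}) . \qquad (18) $$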
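The cancellation of time dependence and overlap factors in this ratio can be checked numerically with synthetic, single-state correlators. In the sketch below the two-point functions are pure exponentials and the three-point function is built for a source with momentum $\vec{p}$ and a sink at rest; all masses, overlaps and the value of the matrix element are hypothetical numbers chosen only for the check.

```python
import math


def two_point(z, energy, t):
    """Single-state two-point function Z * exp(-E t)."""
    return z * math.exp(-energy * t)


def three_point(z0, zp, mass, energy, t, t_f, matrix_element):
    """Single-state three-point function: momentum p from source to the
    current at time t, then at rest from t to the sink at t_f."""
    return (math.sqrt(z0 * zp) * math.exp(-mass * (t_f - t))
            * math.exp(-energy * t) * matrix_element)


def ratio(g3, g2_zero, g2_mom, t, t_f):
    """Ratio of three-point to two-point functions with a square-root
    factor; g2_zero and g2_mom are callables of the time separation."""
    numerator = g2_mom(t_f - t) * g2_zero(t) * g2_zero(t_f)
    denominator = g2_zero(t_f - t) * g2_mom(t) * g2_mom(t_f)
    return g3 / g2_zero(t_f) * math.sqrt(numerator / denominator)


# Hypothetical inputs in lattice units.
MASS, ENERGY = 0.60, 0.75
Z0, ZP = 2.0, 1.3
PI_VALUE = 0.37
T_F = 8


def g2_zero(t):
    return two_point(Z0, MASS, t)


def g2_mom(t):
    return two_point(ZP, ENERGY, t)


for t in range(1, T_F):
    g3 = three_point(Z0, ZP, MASS, ENERGY, t, T_F, PI_VALUE)
    print(t, f"{ratio(g3, g2_zero, g2_mom, t, T_F):.6f}")
```

With real data, excited-state contamination breaks this exact cancellation at small $t$ and small $t_f - t$, which is why a plateau range has to be identified.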



We refer to the range of $t$-values where this asymptotic behavior is observed within our statistical precision as the plateau range. For this study, we use the local electromagnetic current $j^\mu(x) = \bar\psi(x) \gamma^\mu \psi(x)$ and take the renormalization constant $Z_V$ from Ref. [24]. We can extract the two Sachs form factors from the ratio of Eq. (17) by choosing appropriate combinations of the direction $\mu$ of the electromagnetic current and projection matrices $\Gamma$.

Inclusion of a complete set of hadronic states in the two- and three-point functions leads to the following expressions, 
written in Euclidean time: 



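$$ \Pi^{\mu=i}(\Gamma_k, \vec{q}) = C\, \frac{1}{2 m_N}\, \epsilon_{ijk}\, q_j\, G_M(Q^2) , \qquad (19) $$

$$ \Pi^{\mu=i}(\Gamma_0, \vec{q}) = C\, \frac{q_i}{2 m_N}\, G_E(Q^2) , \qquad (20) $$

$$ \Pi^{\mu=0}(\Gamma_0, \vec{q}) = C\, \frac{E_N + m_N}{2 m_N}\, G_E(Q^2) , \qquad (21) $$

where $C = \sqrt{\dfrac{2 m_N^2}{E_N (E_N + m_N)}}$ is a kinematical factor connected to the normalization of the lattice states and the two-point functions entering in the ratio of Eq. (17) [25].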

As schematically shown in Fig. 5, the nucleon three-point function can be written in terms of a connected and a disconnected diagram. The connected diagram can be evaluated via the standard approach of computing the sequential propagator through the sink. The polarized matrix element given in Eq. (19), from which the magnetic form factor is determined, requires an inversion for each $\Gamma_k$ so that we can calculate the matrix element for all momenta $\vec{q}$ in a symmetric way. For the small lattice and heavy pion mass that we have in this study, this can be done very fast. The goal of this work is the computation of the disconnected part, given by



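$$ \langle J_\alpha(x_f) | j^\mu(x) | \bar{J}_\beta(x_i) \rangle_{\rm Disc.} = \mathrm{Tr}\left[\gamma^\mu G(x;x)\right] \times \epsilon^{abc} \epsilon^{a'b'c'} (C\gamma_5)_{\lambda\kappa} (C\gamma_5)_{\lambda'\kappa'}\, G^{bb'}_{\kappa\kappa'}(x_f;x_i) \left( G^{aa'}_{\lambda\lambda'}(x_f;x_i)\, G^{cc'}_{\alpha\beta}(x_f;x_i) - G^{ac'}_{\lambda\beta}(x_f;x_i)\, G^{ca'}_{\alpha\lambda'}(x_f;x_i) \right) . \qquad (22) $$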
As shown diagrammatically, this disconnected contribution consists of the fermion loop multiplied by the nucleon two-point function. Using Eq. (15) one sees that one needs to perform the sum over the spatial coordinates of the current in the fermion loop in order to obtain the nucleon matrix element, requiring knowledge of the all-to-all propagator. Therefore the complexity lies in the evaluation of the disconnected loop.
FIG. 6: On the left we show the error and on the right the ratio given in Eq. (17) for the disconnected diagram using spin dilution. The lines indicate the exact value with its error band. We show, from top to bottom, results for: $\gamma_1$ and $\vec{p} = (1,0,0)$; $\gamma_2$ and $\vec{p} = (0,1,0)$; $\gamma_3$ and $\vec{p} = (0,0,1)$; $\gamma_4$ and $\vec{p} = (1,0,0)$. In all cases the projection matrix $\Gamma_0$ is used and $t_f - t = 4a$.

FIG. 7: On the left we show the error and on the right the ratio given in Eq. (17) for the disconnected diagram using the truncated solver method versus the number of low precision noise vectors. The number of high precision noise vectors used is 1240, for which convergence is reached. The notation is the same as that in Fig. 6.

In Fig. 6 we compare exact results for the disconnected diagram contributing to the ratio defined in Eq. (17) for various operator insertions with results obtained using spin dilution. The behavior of the other stochastic methods, i.e. using color and even-odd dilution, is similar to that of spin dilution and therefore they are not shown. As can be seen, the number of stochastic vectors needed for convergence is large. Even at about 25% of the cost of the exact evaluation the results have not fully converged. Therefore these stochastic dilution schemes are not very effective for calculating the loops with a $\gamma_\mu$ insertion. In Fig. 7 we make a similar comparison but using the truncated solver method. We show the results as a function of the number of low precision noise vectors used. These results were obtained using of the order of $10^3$ high precision noise vectors, or about 2% of the largest number of low precision vectors used. As can be seen, convergence is achieved at much lower cost: although the number of low precision vectors used is of the same order as the number of inversions needed for the exact evaluation, the cost per inversion in the former case is much lower. Therefore the truncated solver method is by far the best choice for the fermion loops entering the evaluation of the electromagnetic form factors.
FIG. 8: Results on the connected (top) and disconnected (bottom) part for the electric form factor. The exact result is shown with the filled (red) squares. The results are compared to spin dilution for 3000 noise vectors (or 12000 inversions) shown with the filled (blue) triangles and with the truncated solver method shown with the filled (green) circles.

FIG. 9: Results on the connected (top) and disconnected (bottom) part of the magnetic form factor. The notation is the same as that in Fig. 8.



In Figs. 8 and 9 we show the results for the electric and magnetic form factors corresponding to the connected and disconnected contributions. As already noted, the results obtained using spin dilution have not fully converged, whereas the results obtained using the truncated solver method are fully consistent with the exact evaluation. The fermion loops entering in the determination of the electromagnetic form factors are very noisy and their contribution at this pion mass is small. A similar conclusion is reached for the nucleon strange form factors in the study of Ref. [11], where the fermion loops have a quark mass similar to ours.

V. NUCLEON SCALAR FORM FACTORS 

Another important quantity where fermion loops may contribute significantly is the nucleon $\sigma$-term given by

$$ \sigma_l = m_l \langle N | \bar{u}u + \bar{d}d | N \rangle , \qquad m_l = \frac{1}{2}(m_u + m_d) . \qquad (23) $$

This is proportional to the nucleon matrix element of the scalar quark density at $Q^2 = 0$. An equivalent quantity can be defined for the strange quark density. Precise knowledge of these quantities is crucial, as their value affects the magnitude of the dark matter cross sections on nuclear targets. Currently the uncertainty on their values represents the largest single uncertainty affecting the cross sections relevant in various super-symmetric models. It is therefore of the utmost importance to minimize the error on the $\sigma$-terms. In this work we focus on the light quark sigma term $\sigma_l$, which is extracted from a chiral analysis of low energy pion-nucleon scattering data. However, phenomenological analyses give somewhat different results [26]. For example, an earlier analysis gave $\sigma_l = 45(8)$ MeV [27], whereas a more recent one gave $\sigma_l = 64(7)$ MeV [28].
FIG. 10: Comparison of the three dilution methods (color, spin, odd-even) (top) and the truncated solver method (bottom) for the scalar current. On the left panel we show the error and on the right panel the ratio given in Eq. (17). In the case of the truncated solver method, we plot against the number of low precision inversions, while the number of high precision inversions is fixed to about 2% of the maximum number of low precision vectors used.

An indirect method to extract the nucleon $\sigma$-terms from lattice QCD without computing the disconnected diagram is to evaluate the dependence of the nucleon mass on the light and strange quark mass [29], [30]-[35]. However, recently stochastic methods have been applied to evaluate the nucleon $\sigma$-terms directly [9, 11]. Therefore it is interesting to compare the various approaches. In Fig. 10 we compare the commonly used dilution schemes for the evaluation of the scalar operator. As can be seen, the convergence for this operator is much better than in the case of the electromagnetic current presented in the previous section. Namely, for the scalar case we converge to the exact value with as few as 500 noise vectors, whereas for the electromagnetic current the stochastic error has not converged even with as many as 12 000 noise vectors, which corresponds to about 1/4 of the number of inversions needed for the exact evaluation. Therefore it is a matter of taste which dilution scheme one employs in the evaluation of the nucleon matrix element of the scalar quark density and the $\sigma$-term. The truncated solver method converges very fast, at only a fraction of the computational cost of the other methods, and proves also here to be the most efficient. In Fig. 11 we show results on the nucleon scalar density as a function of the momentum transfer squared, both for the connected and disconnected contributions. As can be seen, the exact result for the disconnected contribution is reproduced using either noise vectors with spin dilution or the truncated solver method.

FIG. 11: The connected (top) and the disconnected (bottom) contributions to the scalar form factor. The exact result is shown with the filled (red) squares. The results are compared to spin dilution for 500 noise vectors (or 2000 inversions) shown with the filled (blue) triangles and with the truncated solver method shown with the filled (green) circles.

FIG. 12: The nucleon $\sigma$-term as a function of the pion mass squared. The crosses show results derived from the values of the three-point function calculated in Ref. [29], whereas the open circles are the results of this work. The line is a linear extrapolation to the chiral limit, giving the value shown by the filled triangle. The asterisk shows the value of $\sigma_l$ extracted from a recent phenomenological analysis [28], with a systematic error that shows the deviation of this value from an earlier phenomenological determination given in Ref. [27].

In order to calculate $\sigma_l$ one needs to evaluate, besides the connected and disconnected contributions, the quark mass $m_l$. Using the axial Ward-Takahashi identity

$$ \partial_\mu A^a_\mu = 2 m_q P^a , \qquad (24) $$

we can extract the quark mass by taking the matrix element of Eq. (24) between a zero momentum pion state and the vacuum:

$$ m_q = \frac{m_\pi\, \langle 0 | A^a_0 | \pi^a(\vec{0}) \rangle}{2\, \langle 0 | P^a | \pi^a(\vec{0}) \rangle} . \qquad (25) $$



In order to obtain the renormalized quark mass one needs the renormalization constant for the axial-vector current, $Z_A$, and for the pseudoscalar current, $Z_P$, whereas for the scalar density one needs $Z_S$, which can be taken from Ref. [24]. However, the $\sigma$-term is renormalization group invariant and therefore no renormalization is needed.
We find that the $\sigma$-term due to the connected diagram is $\sigma_l^{\rm conn} = 0.278(8)$ GeV and due to the disconnected diagram $\sigma_l^{\rm disc} = 0.224(82)$ GeV, giving $\sigma_l = 0.50(8)$ GeV at a pion mass of about 750 MeV. This value is in agreement with an estimate made on the same ensemble in Ref. [29], if one uses their values of the connected and disconnected three-point function and the PCAC mass instead of the naive quark mass used there. In order to investigate the pion mass dependence of the $\sigma$-term, we performed the calculation using the truncated solver method for the disconnected contribution at two lighter quark masses, namely at $\kappa = 0.1575$ and $\kappa = 0.1580$ on a lattice of size $24^3 \times 48$ [17]. The values of the quark mass for this set of $\kappa$ values are taken from Ref. [17]. The results are shown in Fig. 12, where we also included results from Ref. [29] but using the values of the PCAC mass given in Ref. [13]. Extrapolating linearly in the quark mass, one obtains a value at the physical point that is in agreement with phenomenological estimates of this quantity, albeit with a large statistical error.
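The total quoted above is consistent with simply adding the connected and disconnected pieces. The sketch below repeats that arithmetic, combining the errors in quadrature; treating the two errors as uncorrelated is an assumption made here for illustration and is not stated in the text.

```python
import math


def combine(contributions):
    """Sum central values and add errors in quadrature
    (assumes uncorrelated errors)."""
    total = sum(value for value, _ in contributions)
    error = math.sqrt(sum(err ** 2 for _, err in contributions))
    return total, error


SIGMA_CONNECTED = (0.278, 0.008)      # GeV, from the text
SIGMA_DISCONNECTED = (0.224, 0.082)   # GeV, from the text

sigma_l, sigma_l_err = combine([SIGMA_CONNECTED, SIGMA_DISCONNECTED])
print(f"sigma_l = {sigma_l:.2f} +- {sigma_l_err:.2f} GeV")
```

The error budget is dominated by the disconnected piece, in line with the large gauge noise of the fermion loops discussed earlier.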

VI. CONCLUSIONS 

The focus of this study is to investigate the stochastic techniques that are commonly applied to compute fermion loops by comparing them to the exact evaluation. To this end, we perform an exact evaluation using GPUs for a relatively small lattice of size $16^3 \times 32$ and $N_f = 2$ Wilson fermions, corresponding to a pion mass of about 750 MeV. We consider fermion loops with the $\bar\psi \gamma_\mu \psi$ operator, which are relevant for the electromagnetic current, flavor singlet operators relevant for the calculation of the $\eta'$ mass, and scalar operators relevant for the calculation of the $\sigma$-term.

Comparing color, spin and two spatial types of dilution schemes, the conclusion is that they perform similarly. For the scalar operator, the convergence of these dilution schemes is much faster and one needs an order of magnitude fewer noise vectors than when there is a $\gamma_\mu$-insertion in the loop. Comparing the aforementioned dilution schemes with the truncated solver method, the conclusion is that the latter is by far the most efficient. This conclusion holds for fermion loops involving light quarks. A possible extension of this work is the calculation of all-to-all propagators involved in other processes, for example in the evaluation of three-point diagrams. Such a study was carried out in the case of the semileptonic form factor of the D-meson [36], and it would be interesting to examine similar techniques in the case of the nucleon.